Skip to content

Initial support for ext_oneapi_composite_device.#12178

Merged
steffenlarsen merged 33 commits intointel:syclfrom
maarquitos14:maronas/ext_composite_device
Feb 12, 2024
Merged

Initial support for ext_oneapi_composite_device.#12178
steffenlarsen merged 33 commits intointel:syclfrom
maarquitos14:maronas/ext_composite_device

Conversation

@maarquitos14
Copy link
Contributor

Initial implementation to support sycl_ext_oneapi_composite_device specified in #11846.

Depends on oneapi-src/unified-runtime#1192.

@maarquitos14 maarquitos14 requested review from a team as code owners December 14, 2023 17:03

set(LEVEL_ZERO_LOADER_REPO "https://github.com/oneapi-src/level-zero.git")
set(LEVEL_ZERO_LOADER_TAG v1.11.0)
set(LEVEL_ZERO_LOADER_TAG v1.15.1)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I tested during development of this patch, I'd say this is not used. It seems that we're now using the equivalent in URT repo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change was made to make testing possible during development. It should be updated when the corresponding PR (oneapi-src/unified-runtime#1192) in URT is merged.

Comment on lines 572 to 573
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have to hardcode that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, we don't have to, but we know this extension only works for L0 backend, so we can save the call to PI just by checking this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this, but I can live with that. Please add a comment that this is just a performance optimization though. Same below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in 8d68661.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to a question earlier, why do we have to know that in the SYCL RT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, the comment is just to point out that only GPU architectures can be a composite device, and that's why we only get GPU devices. The Intel part is not really important, but it is informative. I'm open to removing it, if you think it's confusing or does not help.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above - I don't like it but I can live with it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in 8d68661.

Maronas, Marcos added 3 commits December 15, 2023 07:34
Signed-off-by: Maronas, Marcos <marcos.maronas@intel.com>
Signed-off-by: Maronas, Marcos <marcos.maronas@intel.com>
Maronas, Marcos added 2 commits January 12, 2024 12:06
Signed-off-by: Maronas, Marcos <marcos.maronas@intel.com>
Signed-off-by: Marcos Maronas <marcos.maronas@intel.com>
Signed-off-by: Marcos Maronas <marcos.maronas@intel.com>
Signed-off-by: Marcos Maronas <marcos.maronas@intel.com>
@maarquitos14
Copy link
Contributor Author

@aelovikov-intel I think I addressed all your concerns.

Comment on lines 59 to 65
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will have to go away before this can be formally approved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, definitely, it is just to make sure the CI uses the correct UR version.

Comment on lines 22 to 23
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we have an std::hash<device> specialization, so I'd imagine std::set could work here. Feel free to ignore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 8d68661.

Comment on lines 28 to 38
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That can be an std::copy_if(Composites.begin(), Composites.end(), std::back_inserter{Result}, [](...) { /* predicate */ });

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 8d68661.

Comment on lines 572 to 573
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this, but I can live with that. Please add a comment that this is just a performance optimization though. Same below.

sycl::device, ext::oneapi::experimental::info::device::composite_device> {
static sycl::device get(const DeviceImplPtr &Dev) {
if (Dev->getBackend() != backend::ext_oneapi_level_zero)
return {};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not throwing an exception?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the extension specification:

The APIs may be called even when using other backends, but they will return an empty list of composite devices.

Comment on lines 76 to 77
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (!IsL0 || !IsCombined)
assert(CompositeDevs.empty());
assert(CompositeDevs.empty() || (IsL0 && IsCombined));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 8d68661.

Comment on lines +80 to +83
if (std::find(CombinedCompositeDevs.begin(),
CombinedCompositeDevs.end(),
D) == CombinedCompositeDevs.end())
CombinedCompositeDevs.push_back(D);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this check? Devices across platforms can't be the same and I would not expect to have duplicate devices in P.ext_oneapi_get_composite_devices().

Copy link
Contributor Author

@maarquitos14 maarquitos14 Jan 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're just checking that the following statement from extension spec holds for our implementation:

The free function get_composite_devices returns all of the composite devices across all platforms. The member function platform::ext_oneapi_get_composite_devices returns the composite devices within the given platform.

Particularly, we are checking that the free function returns the composite devices across all platforms, and we are not missing any due to a bug.

Comment on lines 89 to 90
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have a guarantee that the order must be the same?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! We don't in this case. We do have to guarantee that several calls to the same function must have the same order, but this is not the case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 8d68661.

Comment on lines 103 to 104
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please combine into a single assert statement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 8d68661.

Comment on lines 128 to 130
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm very surprised to see this check for IsL0 in a test (it was just a performance optimization in the implementation)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in 8d68661.

Marcos Maronas added 2 commits February 7, 2024 07:19
Signed-off-by: Marcos Maronas <marcos.maronas@intel.com>
@maarquitos14
Copy link
Contributor Author

@aarongreig conflicts resolved and UR repo and tag updated :)

Signed-off-by: Marcos Maronas <marcos.maronas@intel.com>
@aarongreig
Copy link
Contributor

@maarquitos14 we resolved the CI issues with #12658, that update also pulls in your changes for this PR so you should just be able to resolve the conflict and we can get this merged

Signed-off-by: Marcos Maronas <marcos.maronas@intel.com>
Signed-off-by: Marcos Maronas <marcos.maronas@intel.com>
@steffenlarsen steffenlarsen merged commit 2db1a4f into intel:sycl Feb 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants

Comments